64 research outputs found

    FABIAN-variant: predicting the effects of DNA variants on transcription factor binding.

    While great advances in predicting the effects of coding variants have been made, the assessment of non-coding variants remains challenging. This is especially problematic for variants within promoter regions which can lead to over-expression of a gene or reduce or even abolish its expression. The binding of transcription factors to the DNA can be predicted using position weight matrices (PWMs). More recently, transcription factor flexible models (TFFMs) have been introduced and shown to be more accurate than PWMs. TFFMs are based on hidden Markov models and can account for complex positional dependencies. Our new web-based application FABIAN-variant uses 1224 TFFMs and 3790 PWMs to predict whether and to which degree DNA variants affect the binding of 1387 different human transcription factors. For each variant and transcription factor, the software combines the results of different models for a final prediction of the resulting binding-affinity change. The software is written in C++ for speed but variants can be entered through a web interface. Alternatively, a VCF file can be uploaded to assess variants identified by high-throughput sequencing. The search can be restricted to variants in the vicinity of candidate genes. FABIAN-variant is available freely at https://www.genecascade.org/fabian/
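
The PWM-based part of such a prediction can be illustrated with a minimal sketch. It is not FABIAN-variant's implementation (which is written in C++ and also combines TFFMs). The count matrix, pseudocount, uniform background and function names below are all assumptions made for illustration. The sketch converts counts to log-odds scores, scans the reference and the variant sequence for the best-scoring site, and reports the difference.

```python
import math

# Hypothetical count matrix for a 4-bp motif with consensus ACGT
# (one dict per motif position); not taken from any real database.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def site_score(pwm, site):
    """Sum the log-odds of the observed base at each motif position."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_score(pwm, sequence):
    """Score every window of motif width and keep the highest score."""
    width = len(pwm)
    return max(
        site_score(pwm, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )


def binding_change(pwm, ref_seq, alt_seq):
    """Negative values suggest weaker predicted binding for the variant allele."""
    return best_score(pwm, alt_seq) - best_score(pwm, ref_seq)


pwm = build_pwm(COUNTS)
reference = "TTACGTTT"
variant = "TTACCTTT"  # G>C substitution inside the motif
delta = binding_change(pwm, reference, variant)
print(f"score change: {delta:.2f}")
```

A real analysis would also scan the reverse strand and calibrate scores per model; both are omitted here for brevity.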

    Deep phenotyping: symptom annotation made simple with SAMS.

    Precision medicine needs precise phenotypes. The Human Phenotype Ontology (HPO) uses clinical signs instead of diagnoses and has become the standard annotation for patients' phenotypes when describing single gene disorders. Use of the HPO beyond human genetics is however still limited. With SAMS (Symptom Annotation Made Simple), we want to bring sign-based phenotyping to routine clinical care, to hospital patients as well as to outpatients. Our web-based application provides access to three widely used annotation systems: HPO, OMIM, Orphanet. Whilst data can be stored in our database, phenotypes can also be imported and exported as Global Alliance for Genomics and Health (GA4GH) Phenopackets without using the database. The web interface can easily be integrated into local databases, e.g. clinical information systems. SAMS lets users share their data with others, empowering patients to record their own signs and symptoms (or those of their children) and thus provide their doctors with additional information. We think that our approach will lead to better characterised patients, which is helpful not only for finding disease mutations but also for better understanding the pathophysiology of diseases and for recruiting patients for studies and clinical trials. SAMS is freely available at https://www.genecascade.org/SAMS/

    GeneDistiller—Distilling Candidate Genes from Linkage Intervals

    Background: Linkage studies often yield intervals containing several hundred positional candidate genes. Different manual or automatic approaches exist for the determination of the gene most likely to cause the disease. While the manual search is very flexible and takes advantage of the researchers’ background knowledge and intuition, it may be very cumbersome to collect and study the relevant data. Automatic solutions on the other hand usually focus on certain models, remain ‘black boxes’ and do not offer the same degree of flexibility. Methodology: We have developed a web-based application that combines the advantages of both approaches. Information from various data sources such as gene-phenotype associations, gene expression patterns and protein-protein interactions was integrated into a central database. Researchers can select which information for the genes within a candidate interval or for single genes shall be displayed. Genes can also interactively be filtered, sorted and prioritised according to criteria derived from the background knowledge and preconception of the disease under scrutiny. Conclusions: GeneDistiller provides knowledge-driven, fully interactive and intuitive access to multiple data sources. It displays maximum relevant information, while saving the user from drowning in the flood of data. A typical query takes less than two seconds, thus allowing an interactive and explorative approach to the hunt for the candidate gene.

    MutationDistiller: user-driven identification of pathogenic DNA variants

    MutationDistiller is a freely available online tool for user-driven analyses of Whole Exome Sequencing data. It offers a user-friendly interface aimed at clinicians and researchers, who are not necessarily bioinformaticians. MutationDistiller combines MutationTaster’s pathogenicity predictions with a phenotype-based approach. Phenotypic information is not limited to symptoms included in the Human Phenotype Ontology (HPO), but may also comprise clinical diagnoses and the suspected mode of inheritance. The search can be restricted to lists of candidate genes (e.g. virtual gene panels) and by tissue-specific gene expression. The inclusion of GeneOntology (GO) and metabolic pathways facilitates the discovery of hitherto unknown disease genes. In a novel approach, we trained MutationDistiller’s HPO-based prioritization on authentic genotype–phenotype sets obtained from ClinVar and found it to match or outcompete current prioritization tools in terms of accuracy. In the output, the program provides a list of potential disease mutations ordered by the likelihood of the affected genes to cause the phenotype. MutationDistiller provides links to gene-related information from various resources. It has been extensively tested by clinicians and their suggestions have been valued in many iterative cycles of revisions. The tool, a comprehensive documentation and examples are freely available at https://www.mutationdistiller.org

    AutozygosityMapper: Identification of disease-mutations in consanguineous families

    With the shift from SNP arrays to high-throughput sequencing, most researchers studying diseases in consanguineous families do not rely on linkage analysis any longer, but simply search for deleterious variants which are homozygous in all patients. AutozygosityMapper allows the fast and convenient identification of disease mutations in patients from consanguineous pedigrees by focussing on homozygous segments shared by all patients. Users can upload multi-sample VCF files, including WGS data, without any pre-processing. Genome-wide runs of homozygosity and the underlying genotypes are presented in graphical interfaces. AutozygosityMapper extends the functions of its predecessor, HomozygosityMapper, to the search for autozygous regions, in which all patients share the same homozygous genotype. We provide export of VCF files containing only the variants found in homozygous regions; this usually reduces the number of variants by two orders of magnitude. These regions can also directly be analysed with our disease mutation identification tool MutationDistiller. The application comes with simple and intuitive graphical interfaces for data upload, analysis, and results. We kept the structure of HomozygosityMapper so that previous users will find it easy to switch. With AutozygosityMapper, we provide a fast web-based way to identify disease mutations in consanguineous families. AutozygosityMapper is freely available at https://www.genecascade.org/AutozygosityMapper/
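
The idea of an autozygous region, a stretch where all patients share the same homozygous genotype, can be sketched in a few lines. This is a simplified illustration, not AutozygosityMapper's algorithm, which the abstract does not describe. The input layout, the function name and the `min_markers` threshold are assumptions, and genotyping errors, missing calls and physical distances are ignored.

```python
def shared_autozygous_runs(genotypes, min_markers=3):
    """Return (start, end) marker index pairs where every sample carries
    the same homozygous genotype for at least `min_markers` consecutive markers.

    `genotypes` maps a sample name to a list of (allele1, allele2) tuples,
    all lists following the same marker order along one chromosome.
    """
    samples = list(genotypes.values())
    n_markers = len(samples[0])
    runs = []
    start = None
    for i in range(n_markers):
        calls = [sample[i] for sample in samples]
        first = calls[0]
        # Shared only if the first sample is homozygous and all others match it.
        shared = first[0] == first[1] and all(call == first for call in calls)
        if shared and start is None:
            start = i
        elif not shared and start is not None:
            if i - start >= min_markers:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and n_markers - start >= min_markers:
        runs.append((start, n_markers - 1))
    return runs


# Hypothetical genotypes for two affected relatives at six markers.
patients = {
    "patient_1": [("A", "G"), ("A", "A"), ("C", "C"), ("T", "T"), ("G", "A"), ("C", "C")],
    "patient_2": [("A", "A"), ("A", "A"), ("C", "C"), ("T", "T"), ("G", "G"), ("C", "C")],
}
runs = shared_autozygous_runs(patients, min_markers=3)
print(runs)
```

Restricting a variant list to the returned intervals corresponds to the filtering step described above, where only variants inside homozygous regions are kept.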

    FragIdent – Automatic identification and characterisation of cDNA-fragments

    BACKGROUND: Many genetic studies and functional assays are based on cDNA fragments. After the generation of cDNA fragments from an mRNA sample, their content is at first unknown and must be assigned by sequencing reactions or hybridisation experiments. Even in characterised libraries, a considerable number of clones are wrongly annotated. Furthermore, mix-ups can happen in the laboratory. It is therefore essential to the relevance of experimental results to confirm or determine the identity of the employed cDNA fragments. However, the manual approach for the characterisation of these fragments using BLAST web interfaces is not suited for larger numbers of sequences and, so far, no user-friendly software is publicly available. RESULTS: Here we present the development of FragIdent, an application for the automatic identification of open reading frames (ORFs) within cDNA fragments. The software performs BLAST analyses to identify the genes represented by the sequences and suggests primers to complete the sequencing of the whole insert. Gene-specific information as well as the protein domains encoded by the cDNA fragment are retrieved from Internet-based databases and included in the output. The application features an intuitive graphical interface and is designed for researchers without any bioinformatics skills. It is suited for projects comprising up to several hundred different clones. CONCLUSION: We used FragIdent to identify 84 cDNA clones from a yeast two-hybrid experiment. Furthermore, we identified 131 protein domains within our analysed clones. The source code is freely available from our homepage at

    Aviator: a web service for monitoring the availability of web services

    With Aviator, we present a web service and repository that facilitates surveillance of online tools. Aviator consists of a user-friendly website and two modules: a general module based on literature mining and a manually curated module. The general module currently checks 9417 websites twice a day with respect to their availability and stores many features (frontend and backend response time, required RAM and size of the web page, security certificates, analytic tools and trackers embedded in the webpage and others) in a data warehouse. Aviator is also equipped with an analysis functionality; for example, authors can check and evaluate the availability of their own tools or those of their peers. Likewise, users can check the availability of a certain tool they intend to use in research or teaching to avoid including unstable tools. The curated section of Aviator offers additional services. We provide API snippets for common programming languages (Perl, PHP, Python, JavaScript) as well as an OpenAPI documentation for embedding in the backend of one's own web services for an automatic test of their function. We query the respective APIs twice a day and send automated notifications in case of an unexpected result. Naturally, the same analysis functionality as for the literature-based module is available for the curated section. Aviator can freely be used at https://www.ccb.uni-saarland.de/aviator

    MutationTaster2021

    Here we present an update to MutationTaster, our DNA variant effect prediction tool. The new version uses a different prediction model and attains higher accuracy than its predecessor, especially for rare benign variants. In addition, we have integrated many sources of data that only became available after the last release (such as gnomAD and ExAC pLI scores) and changed the splice site prediction model. To more easily assess the relevance of detected known disease mutations to the clinical phenotype of the patient, MutationTaster now provides information on the diseases they cause. Further changes represent a major overhaul of the interfaces to increase user-friendliness whilst many changes under the hood have been designed to accelerate the processing of uploaded VCF files. We also offer an API for the rapid automated query of smaller numbers of variants from within other software. MutationTaster2021 integrates our disease mutation search engine, MutationDistiller, to prioritise variants from VCF files using the patient's clinical phenotype. The novel version is available at https://www.genecascade.org/MutationTaster2021/. This website is free and open to all users and there is no login requirement

    d-matrix – database exploration, visualization and analysis

    BACKGROUND: Motivated by a biomedical database set up by our group, we aimed to develop a generic database front-end with embedded knowledge discovery and analysis features. A major focus was the human-oriented representation of the data and the enabling of a closed circle of data query, exploration, visualization and analysis. RESULTS: We introduce a non-task-specific database front-end with a new visualization strategy and built-in analysis features, so called d-matrix. d-matrix is web-based and compatible with a broad range of database management systems. The graphical outcome consists of boxes whose colors show the quality of the underlying information and, as the name suggests, they are arranged in matrices. The granularity of the data display allows consequent drill-down. Furthermore, d-matrix offers context-sensitive categorization, hierarchical sorting and statistical analysis. CONCLUSIONS: d-matrix enables data mining, with a high level of interactivity between humans and computer as a primary factor. We believe that the presented strategy can be very effective in general and especially useful for the integration of distinct data types such as phenotypical and molecular data

    Systematic Comparison of Three Methods for Fragmentation of Long-Range PCR Products for Next Generation Sequencing

    Next Generation Sequencing (NGS) technologies are gaining importance in the routine clinical diagnostic setting. It is thus desirable to simplify the workflow for high-throughput diagnostics. Fragmentation of DNA is a crucial step for preparation of template libraries and various methods are currently known. Here we evaluated the performance of nebulization, sonication and random enzymatic digestion of long-range PCR products on the results of NGS. All three methods produced high-quality sequencing libraries for the 454 platform. However, if long-range PCR products of different length were pooled equimolarly, sequence coverage drastically dropped for fragments below 3,000 bp. All three methods performed equally well with regard to overall sequence quality (PHRED) and read length. Enzymatic fragmentation showed highest consistency between three library preparations but performed slightly worse than sonication and nebulization with regard to insertions/deletions in the raw sequence reads. After filtering for homopolymer errors, enzymatic fragmentation performed best if compared to the results of classic Sanger sequencing. As the overall performance of all three methods was equal with only minor differences, a fragmentation method can be chosen solely according to lab facilities, feasibility and experimental design